Transforming the Data Transcription and Analysis Tool Metadata and Labels into a Linguistic Linked Open Data Cloud Resource
Abstract
Developing language resources requires much time, funding and effort. This is why they need to be reused in new projects and developments, so that they may both serve a wider scientific community and sustain their cost. The main problems that prevent this from happening are that (1) language resources are rarely free and/or easy to locate; and (2) they are hardly ever interoperable. Therefore, the language resource community is now working to transform their most valuable assets into open and interoperable resources, which can then be shared and linked with other open and interoperable resources. This will allow data to be reanalyzed and repurposed. In this paper, we present the first steps taken to transform a set of such resources, namely the Data Transcription and Analysis Tool’s (DTA) metadata and data, into an open and interoperable language resource. These first steps include the development of two ontologies that formalize the conceptual model underlying the DTA metadata and the labels used in the DTA to annotate both utterances and their transcriptions at several annotation levels.
Similar resources
Representing Multilingual Data as Linked Data: the Case of BabelNet 2.0
Recent years have witnessed a surge in the amount of semantic information published on the Web. Indeed, the Web of Data, a subset of the Semantic Web, has been increasing steadily in both volume and variety, transforming the Web into a ‘global database’ in which resources are linked across sites. Linguistic fields – in a broad sense – have not been left behind, and we observe a similar trend wi...
The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud
The Open Linguistics Working Group (OWLG) brings together researchers from various fields of linguistics, natural language processing, and information technology to present and discuss principles, case studies, and best practices for representing, publishing and linking linguistic data collections. A major outcome of our work is the Linguistic Linked Open Data (LLOD) cloud, an LOD (sub-)cloud o...
ExpLOD: Summary-Based Exploration of Interlinking and RDF Usage in the Linked Open Data Cloud
Publishing interlinked RDF datasets as links between data items identified using dereferenceable URIs on the web brings forward a number of issues. A key challenge is to understand the data, the schema, and the interlinks that are actually used both within and across linked datasets. Understanding actual RDF usage is critical in the increasingly common situations where terms from different voca...
Three Birds (in the LLOD Cloud) with One Stone: BabelNet, Babelfy and the Wikipedia Bitaxonomy
In this paper we present the current status of linguistic resources published as linked data and linguistic services in the LLOD cloud in our research group, namely BabelNet, Babelfy and the Wikipedia Bitaxonomy. We describe them in terms of their salient aspects and objectives and discuss the benefits that each of these potentially brings to the world of LLOD NLP-aware services. We also presen...
Linguistic Linked Data in Chinese: The Case of Chinese Wordnet
The present study describes recent developments of Chinese Wordnet, which has been reformatted using the lemon model and published as part of the Linguistic Linked Open Data Cloud. While lemon suffices for modeling most of the structures in Chinese Wordnet at the lexical level, the model does not allow for finer-grained distinction of a word sense, or meaning facets, a linguistic feature also at...